1.
Cell Rep Methods ; 3(8): 100543, 2023 08 28.
Article in English | MEDLINE | ID: mdl-37671027

ABSTRACT

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
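The idea of conserved anchor sequences and the intervals between them can be sketched in a few lines. This is a toy illustration only, not the authors' pipeline: the function names, the choice of k = 5, and the two short sequences are all assumptions made for the example. It keeps k-mers that occur exactly once in every assembly, then compares the spacing between consecutive anchors; a spacing that differs between assemblies marks an interval carrying a structural difference.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Return the set of k-mers that occur exactly once in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def conserved_kmers(assemblies, k):
    """K-mers that are present and single-copy in every assembly (toy stand-in for PSTs)."""
    per_assembly = [unique_kmers(seq, k) for seq in assemblies.values()]
    return set.intersection(*per_assembly)


def anchor_spacing(seq, anchors):
    """Distances between the start positions of consecutive anchors along seq."""
    positions = sorted(seq.find(a) for a in anchors)
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical toy "assemblies": hap1 carries a 2 bp insertion relative to ref.
assemblies = {
    "ref": "AACGTGGGCTTA",
    "hap1": "AACGTCCGGGCTTA",
}
anchors = conserved_kmers(assemblies, k=5)
spacing = {name: anchor_spacing(seq, anchors) for name, seq in assemblies.items()}
```

In this toy case the first interval is 5 bp in `ref` and 7 bp in `hap1`, so it would be flagged as polymorphic, while the remaining intervals are identical.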


Subjects
Neoplasms, Squamous Cell , Skin Neoplasms , Humans , Conserved Sequence , Haploidy , Polymorphism, Genetic
2.
J Transl Med ; 21(1): 378, 2023 06 10.
Article in English | MEDLINE | ID: mdl-37301971

ABSTRACT

BACKGROUND: Diagnosis of rare genetic diseases can be a long, expensive and complex process, involving an array of tests in the hope of obtaining an actionable result. Long-read sequencing platforms offer the opportunity to make definitive molecular diagnoses using a single assay capable of detecting variants, characterizing methylation patterns, resolving complex rearrangements, and assigning findings to long-range haplotypes. Here, we demonstrate the clinical utility of Nanopore long-read sequencing by validating a confirmatory test for copy number variants (CNVs) in neurodevelopmental disorders and illustrate the broader applications of this platform to assess genomic features with significant clinical implications. METHODS: We used adaptive sampling on the Oxford Nanopore platform to sequence 25 genomic DNA samples and 5 blood samples collected from patients with known or false-positive copy number changes originally detected using short-read sequencing. Across the 30 samples (a total of 50 with replicates), we assayed 35 known unique CNVs (a total of 55 with replicates) and one false-positive CNV, ranging in size from 40 kb to 155 Mb, and assessed the presence or absence of suspected CNVs using normalized read depth. RESULTS: Across 50 samples (including replicates) sequenced on individual MinION flow cells, we achieved an average on-target mean depth of 9.5X and an average on-target read length of 4805 bp. Using a custom read depth-based analysis, we successfully confirmed the presence of all 55 known CNVs (including replicates) and the absence of one false-positive CNV. Using the same CNV-targeted data, we compared genotypes of single nucleotide variant loci to verify that no sample mix-ups occurred between assays. For one case, we also used methylation detection and phasing to investigate the parental origin of a 15q11.2-q13 duplication with implications for clinical prognosis. 
CONCLUSIONS: We present an assay that efficiently targets genomic regions to confirm clinically relevant CNVs with a concordance rate of 100%. Furthermore, we demonstrate how integration of genotype, methylation, and phasing data from the Nanopore sequencing platform can potentially simplify and shorten the diagnostic odyssey.
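A read depth-based CNV check of the kind described above can be sketched as follows. The abstract does not give the details of the custom analysis, so everything here is an assumption for illustration: the function names, the use of the median depth of copy-neutral control regions as the baseline, and the 1.5 / 2.5 copy thresholds.

```python
from statistics import median


def estimate_copy_number(target_depth, control_depths, ploidy=2):
    """Scale the target region's mean depth by the median depth of control regions.

    Assumes the control regions are copy-neutral (diploid) in this sample.
    """
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control regions have no coverage")
    return ploidy * target_depth / baseline


def classify_cnv(copy_number, loss_below=1.5, gain_above=2.5):
    """Hypothetical thresholds halfway between integer copy states."""
    if copy_number < loss_below:
        return "loss"
    if copy_number > gain_above:
        return "gain"
    return "neutral"


# Toy depths: target covered at half the control baseline -> one copy.
cn = estimate_copy_number(4.75, [9.0, 9.5, 10.0])
call = classify_cnv(cn)
```

A suspected false-positive CNV would be expected to come out as "neutral" under this scheme, since its normalized depth stays near two copies.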


Subjects
Nanopore Sequencing , Humans , DNA Copy Number Variations/genetics , Workflow , Genomics , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing
3.
Evolution ; 77(2): 454-466, 2023 02 04.
Article in English | MEDLINE | ID: mdl-36625708

ABSTRACT

Evolution of self-fertilization may be initiated by a historical population bottleneck, which should diagnostically reduce lineage-wide genetic variation. However, selfing can also strongly reduce genetic variation after it evolves. Distinguishing process from pattern is less problematic if mating system divergence is recent and geographically simple. Dramatically reduced diversity is associated with the transition from outcrossing to selfing in the Pacific coastal endemic Abronia umbellata that includes large-flowered, self-incompatible populations (var. umbellata) south of San Francisco Bay and small-flowered, autogamous populations (var. breviflora) to the north. Compared to umbellata, synonymous nucleotide diversity across 10 single-copy nuclear genes was reduced by 94% within individual populations and 90% across the whole selfing breviflora lineage, which contained no unique polymorphisms. The geographic pattern of genetic variation is consistent with a single origin of selfing that occurred recently (7-28 kya). These results are best explained by a historical bottleneck, but the two most northerly umbellata populations also contained little variation and clustered with selfing populations, suggesting that substantial diversity loss preceded the origin of selfing. A bottleneck may have set the stage for the eventual evolution of selfing by purging genetic load that prevents the spread of selfing.


Subjects
Reproduction , Self-Fertilization , Polymorphism, Genetic , Plants , Flowers/genetics
4.
Sci Rep ; 12(1): 10333, 2022 06 20.
Article in English | MEDLINE | ID: mdl-35725745

ABSTRACT

Autophagy is a housekeeping mechanism tasked with eliminating misfolded proteins and damaged organelles to maintain cellular homeostasis. Autophagy deficiency results in increased oxidative stress, DNA damage and chronic cellular injury. Among the core genes in the autophagy machinery, ATG7 is required for autophagy initiation and autophagosome formation. Based on the analysis of an extended pedigree of familial cholangiocarcinoma, we determined that all affected family members had a novel germline mutation (c.2000C>T p.Arg659* (p.R659*)) in ATG7. Somatic deletions of ATG7 were identified in the tumors of affected individuals. We applied linked-read sequencing to one tumor sample and demonstrated that the ATG7 somatic deletion and germline mutation were located on distinct alleles, resulting in two hits to ATG7. From a parallel population genetic study, we identified a germline polymorphism of ATG7 (c.1591C>G p.Asp522Glu (p.D522E)) associated with increased risk of cholangiocarcinoma. To characterize the impact of these germline ATG7 variants on autophagy activity, we developed an ATG7-null cell line derived from the human bile duct. The mutant p.R659* ATG7 protein lacked the ability to lipidate its LC3 substrate, leading to complete loss of autophagy and increased p62 levels. Our findings indicate that germline ATG7 variants have the potential to impact autophagy function with implications for cholangiocarcinoma development.


Subjects
Autophagy-Related Protein 7 , Bile Duct Neoplasms , Cholangiocarcinoma , RNA-Binding Proteins , Autophagy/genetics , Autophagy-Related Protein 7/genetics , Bile Duct Neoplasms/genetics , Bile Ducts, Intrahepatic , Cholangiocarcinoma/genetics , Germ Cells/metabolism , Humans , RNA-Binding Proteins/genetics
5.
Nucleic Acids Res ; 50(W1): W448-W453, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35474383

ABSTRACT

K-mers are short DNA sequences that are used for genome sequence analysis. Applications that use k-mers include genome assembly and alignment. However, the wider bioinformatic use of these short sequences has challenges related to the massive scale of genomic sequence data. A single human genome assembly has billions of k-mers. As a result, the computational requirements for analyzing k-mer information are enormous, particularly when involving complete genome assemblies. To address these issues, we developed a new indexing data structure based on a hash table tuned for the lookup of short sequence keys. This web application, referred to as KmerKeys, provides performant, rapid query speeds for cloud computation on genome assemblies. We enable fuzzy as well as exact sequence searches of assemblies. To enable robust and speedy performance, the website implements cache-friendly hash tables, memory mapping and massive parallel processing. Our method employs a scalable and efficient data structure that can be used to jointly index and search a large collection of human genome assembly information. One can include variant databases and their associated metadata such as the gnomAD population variant catalogue. This feature enables the incorporation of future genomic information into sequencing analysis. KmerKeys is freely accessible at https://kmerkeys.dgi-stanford.org.
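The exact and fuzzy k-mer lookups mentioned above can be illustrated with an ordinary in-memory hash table. This is a minimal sketch and not the KmerKeys implementation or its API: the function names are hypothetical, a Python `dict` stands in for the tuned, memory-mapped hash table, and "fuzzy" is assumed here to mean at most one substitution.

```python
from collections import defaultdict


def build_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return dict(index)


def exact_search(index, kmer):
    """Positions of an exact k-mer match, or an empty list."""
    return index.get(kmer, [])


def fuzzy_search(index, kmer, alphabet="ACGT"):
    """Matches within one substitution: probe the query and each 1-mismatch neighbor."""
    hits = {}
    if kmer in index:
        hits[kmer] = index[kmer]
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt == base:
                continue
            neighbor = kmer[:i] + alt + kmer[i + 1:]
            if neighbor in index:
                hits[neighbor] = index[neighbor]
    return hits


index = build_index("ACGTACGA", k=4)
exact = exact_search(index, "ACGT")
fuzzy = fuzzy_search(index, "ACGC")
```

Probing neighbors keeps each fuzzy query at 3k + 1 hash lookups for a DNA alphabet, which is why this style of search stays cheap for short keys; allowing more mismatches or indels would need a different strategy.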


Subjects
Algorithms , Sequence Analysis, DNA , Software , Humans , Genome, Human , Genomics/methods , Sequence Analysis, DNA/methods
6.
Genome Med ; 13(1): 145, 2021 09 06.
Article in English | MEDLINE | ID: mdl-34488871

ABSTRACT

We developed a sensitive sequencing approach that simultaneously profiles microsatellite instability, chromosomal instability, and subclonal structure in cancer. We assessed diverse repeat motifs across 225 microsatellites on colorectal carcinomas. Our study identified elevated alterations at both selected tetranucleotide and conventional mononucleotide repeats. Many colorectal carcinomas had a mix of genomic instability states that are normally considered exclusive. An MSH3 mutation may have contributed to the mixed states. Increased copy number of chromosome arm 8q was most prevalent among tumors with microsatellite instability, including a case of translocation involving 8q. Subclonal analysis identified co-occurring driver mutations previously known to be exclusive.


Subjects
Chromosomal Instability , Chromosomes, Human, Pair 8 , Colorectal Neoplasms/genetics , DNA Mismatch Repair , Genotype , Humans , Microsatellite Repeats , MutS Homolog 3 Protein/genetics , MutS Homolog 3 Protein/metabolism , Neoplasm Proteins/genetics , Whole Genome Sequencing
7.
NAR Cancer ; 3(4): zcab049, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34988460

ABSTRACT

Dysbiosis is an imbalance of an organ's microbiome and plays a role in colorectal cancer pathogenesis. Characterizing the bacteria in the microenvironment of a cancer through genome sequencing has advantages compared to culture-based profiling. However, there are notable technical and analytical challenges in characterizing universal features of tumor microbiomes. Colorectal tumors demonstrate microbiome variation among different studies and across individual patients. To address these issues, we conducted a computational study to determine a consensus microbiome for colorectal cancer, analyzing 924 tumors from eight independent RNA-Seq data sets. A standardized meta-transcriptomic analysis pipeline was established with quality control metrics. Microbiome profiles across different cohorts were compared and recurrently altered microbial shifts specific to colorectal cancer were determined. We identified a cancer-specific set of 114 microbial species associated with tumors that were found among all investigated studies. Firmicutes, Bacteroidetes, Proteobacteria and Actinobacteria were among the four most abundant phyla for the colorectal cancer microbiome. Member species of Clostridia were depleted and Fusobacterium nucleatum was one of the most enriched bacterial species in tumors. Associations between the consensus species and specific immune cell types were noted. Our results are available as a web data resource for other researchers to explore (https://crc-microbiome.stanford.edu).

8.
Sci Rep ; 10(1): 5009, 2020 03 19.
Article in English | MEDLINE | ID: mdl-32193467

ABSTRACT

DNA copy number aberrations (CNA) are frequently observed in colorectal cancers (CRC). There is an urgent need for CNA-based biomarkers in clinics. For Stage III CRC, if combined with imaging or pathologic evidence, these markers promise more precise care. We conducted this Stage III specific biomarker discovery with a cohort of 134 CRCs, and with a newly developed high-efficiency CNA profiling protocol. Specifically, we developed the profiling protocol for tumor-normal matched tissue samples based on low-coverage clinical whole-genome sequencing (WGS). We demonstrated the protocol's accuracy and robustness by a systematic benchmark with microarray, high-coverage whole-exome and -genome approaches, where the low-coverage WGS-derived CNA segments were highly concordant (PCC >0.95) with those derived from microarray, and they were substantially less variable if compared to exome-derived segments. A lasso-based model and multivariate Cox regression analysis identified a chromosome 17p loss, containing the TP53 tumor suppressor gene, that was significantly associated with reduced survival (P = 0.0139, HR = 1.688, 95% CI = [1.112-2.562]), which was validated by an independent cohort of 187 Stage III CRCs. In summary, this low-coverage WGS protocol has high sensitivity, high resolution and low cost, and the identified 17p loss is an effective poor-prognosis marker for Stage III patients.
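The reported hazard ratio, confidence interval, and P value can be checked against each other with a short calculation. The sketch below assumes the interval is a standard Wald interval on the log hazard ratio and that the P value comes from a two-sided Wald test; the abstract does not state this, and the function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the z statistic, and a two-sided p-value from HR and its 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = wald_from_ci(1.688, 1.112, 2.562)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions the recovered p-value lands close to the reported P = 0.0139, so the three published numbers are mutually consistent.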


Subjects
Biomarkers, Tumor , Colorectal Neoplasms/genetics , Colorectal Neoplasms/mortality , DNA Copy Number Variations/genetics , Gene Deletion , Tumor Suppressor Protein p53/genetics , Whole Genome Sequencing/methods , Adult , Aged , Aged, 80 and over , Chromosome Deletion , Chromosomes, Human, Pair 17/genetics , Female , Genetic Markers , Humans , Male , Middle Aged , Neoplasm Staging , Prognosis , Smith-Magenis Syndrome/diagnosis , Smith-Magenis Syndrome/genetics , Survival Rate , Young Adult
9.
Nucleic Acids Res ; 47(19): e115, 2019 11 04.
Article in English | MEDLINE | ID: mdl-31350896

ABSTRACT

The human genome is composed of two haplotypes, otherwise called diplotypes, which denote phased polymorphisms and structural variations (SVs) that are derived from both parents. Diplotypes place genetic variants in the context of cis-related variants from a diploid genome. As a result, they provide valuable information about hereditary transmission, context of SV, regulation of gene expression and other features which are informative for understanding human genetics. Successful diplotyping with short read whole genome sequencing generally requires either a large population or parent-child trio samples. To overcome these limitations, we developed a targeted sequencing method for generating megabase (Mb)-scale haplotypes with short reads. One selects specific 0.1-0.2 Mb high molecular weight DNA targets with custom-designed Cas9-guide RNA complexes followed by sequencing with barcoded linked reads. To test this approach, we designed three assays, targeting the BRCA1 gene, the entire 4-Mb major histocompatibility complex locus and 18 well-characterized SVs, respectively. Using an integrated alignment- and assembly-based approach, we generated comprehensive variant diplotypes spanning the entirety of the targeted loci and characterized SVs with exact breakpoints. Our results were comparable in quality to long read sequencing.


Subjects
Genome, Human/genetics , Genomic Structural Variation/genetics , High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Diploidy , Gene Expression Regulation/genetics , Genetic Association Studies/methods , Haplotypes/genetics , Humans , Sequence Analysis, DNA/methods
10.
Nucleic Acids Res ; 47(8): 3846-3861, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30864654

ABSTRACT

HepG2 is one of the most widely used human cancer cell lines in biomedical research and one of the main cell lines of ENCODE. Although the functional genomic and epigenomic characteristics of HepG2 are extensively studied, its genome sequence has never been comprehensively analyzed and higher order genomic structural features are largely unknown. The high degree of aneuploidy in HepG2 renders traditional genome variant analysis methods challenging and partially ineffective. Correct and complete interpretation of the extensive functional genomics data from HepG2 requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of genome characteristics in HepG2: copy numbers of chromosomal segments at high resolution, SNVs and Indels (corrected for aneuploidy), regions with loss of heterozygosity, phased haplotypes extending to entire chromosome arms, retrotransposon insertions and structural variants (SVs) including complex and somatic genomic rearrangements. A large number of SVs were phased, sequence assembled and experimentally validated. We re-analyzed published HepG2 datasets for allele-specific expression and DNA methylation and assembled an allele-specific CRISPR/Cas9 targeting map. We demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.


Subjects
Chromosome Mapping/methods , Genome, Human , Genomics/methods , Haplotypes , Sequence Analysis, DNA/statistics & numerical data , Alleles , Aneuploidy , DNA Methylation , Genomic Structural Variation , Hep G2 Cells , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Karyotyping , Loss of Heterozygosity , Polymorphism, Single Nucleotide , Retroelements
11.
Genome Res ; 29(3): 472-484, 2019 03.
Article in English | MEDLINE | ID: mdl-30737237

ABSTRACT

K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and also most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high-resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.


Subjects
Genome, Human , Humans , K562 Cells , Karyotype , Polymorphism, Genetic , Whole Genome Sequencing
12.
J Pathol ; 247(2): 199-213, 2019 02.
Article in English | MEDLINE | ID: mdl-30350422

ABSTRACT

Variable tumor cellularity can limit sensitivity and precision in comparative genomics because differences in tumor content can result in misclassifying truncal mutations as region-specific private mutations in stroma-rich regions, especially when studying tissue specimens of mediocre tumor cellularity such as lung adenocarcinomas (LUADs). To address this issue, we refined a nuclei flow-sorting approach by sorting nuclei based on ploidy and the LUAD lineage marker thyroid transcription factor 1 and applied this method to investigate genome-wide somatic copy number aberrations (SCNAs) and mutations of 409 cancer genes in 39 tumor populations obtained from 16 primary tumors and 21 matched metastases. This approach increased the mean tumor purity from 54% (range 7-89%) of unsorted material to 92% (range 79-99%) after sorting. Despite this rise in tumor purity, we detected limited genetic heterogeneity between primary tumors and their metastases. In fact, 88% of SCNAs and 80% of mutations were propagated from primary tumors to metastases and low allele frequency mutations accounted for much of the mutational heterogeneity. Even though the presence of SCNAs indicated a history of chromosomal instability (CIN) in all tumors, metastases did not have more SCNAs than primary tumors. Moreover, tumors with biallelic TP53 or ATM mutations had high numbers of SCNAs, yet they were associated with a low interlesional genetic heterogeneity. The results of our study thus provide evidence that most macroevolutionary events occur in primary tumors before metastatic dissemination and advocate for a limited degree of CIN over time and space in this cohort of LUADs. Sampling of primary tumors thus may suffice to detect most mutations and SCNAs. In addition, metastases but not primary tumors had seeded additional metastases in three of four patients; this provides a genomic rationale for surgical treatment of such oligometastatic LUADs.
Copyright © 2018 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.


Subjects
Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/secondary , Biomarkers, Tumor/genetics , Cell Separation/methods , Flow Cytometry , Genetic Heterogeneity , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Adult , Comparative Genomic Hybridization , DNA Copy Number Variations , Female , Gene Dosage , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Humans , Male , Middle Aged , Mutation , Mutation Rate , Phenotype , Retrospective Studies , Spatio-Temporal Analysis
13.
Nucleic Acids Res ; 45(19): e162, 2017 Nov 02.
Article in English | MEDLINE | ID: mdl-28977555

ABSTRACT

Genomic instability is a frequently occurring feature of cancer that involves large-scale structural alterations. These somatic changes in chromosome structure include duplication of entire chromosome arms and aneuploidy where chromosomes are duplicated beyond normal diploid content. However, the accurate determination of aneuploidy events in cancer genomes is a challenge. Recent advances in sequencing technology allow the characterization of haplotypes that extend megabases along the human genome using high molecular weight (HMW) DNA. For this study, we employed a library preparation method in which sequence reads have barcodes linked to single HMW DNA molecules. Barcode-linked reads are used to generate extended haplotypes on the order of megabases. We developed a method that leverages haplotypes to identify chromosomal segmental alterations in cancer and uses this information to join haplotypes together, thus extending the range of phased variants. With this approach, we identified mega-haplotypes that encompass entire chromosome arms. We characterized the chromosomal arm changes and aneuploidy events in a manner that offers similar information as a traditional karyotype but with the benefit of DNA sequence resolution. We applied this approach to characterize aneuploidy and chromosomal alterations from a series of primary colorectal cancers.


Subjects
Aneuploidy , Haplotypes , Neoplasms/genetics , Chromosome Aberrations , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , DNA Mutational Analysis/methods , Genome, Human/genetics , Genomic Instability , High-Throughput Nucleotide Sequencing/methods , Humans , Karyotype , Karyotyping/methods , Neoplasms/diagnosis , Reproducibility of Results , Sensitivity and Specificity
14.
Genome Med ; 9(1): 57, 2017 06 19.
Article in English | MEDLINE | ID: mdl-28629429

ABSTRACT

BACKGROUND: Genome rearrangements are critical oncogenic driver events in many malignancies. However, the identification and resolution of the structure of cancer genomic rearrangements remain challenging even with whole genome sequencing. METHODS: To identify oncogenic genomic rearrangements and resolve their structure, we analyzed linked read sequencing. This approach relies on a microfluidic droplet technology to produce libraries derived from single, high molecular weight DNA molecules, 50 kb in size or greater. After sequencing, the barcoded sequence reads provide long range genomic information, identify individual high molecular weight DNA molecules, determine the haplotype context of genetic variants that occur across contiguous megabase-length segments of the genome and delineate the structure of complex rearrangements. We applied linked read sequencing of whole genomes to the analysis of a set of synchronous metastatic diffuse gastric cancers that occurred in the same individual. RESULTS: When comparing metastatic sites, our analysis implicated a complex somatic rearrangement that was present in the metastatic tumor. The oncogenic event associated with the identified complex rearrangement resulted in an amplification of the known cancer driver gene FGFR2. With further investigation using these linked read data, the FGFR2 copy number alteration was determined to be a deletion-inversion motif that underwent tandem duplication, with unique breakpoints in each metastasis. Using a three-dimensional organoid tissue model, we functionally validated the metastatic potential of an FGFR2 amplification in gastric cancer. CONCLUSIONS: Our study demonstrates that linked read sequencing is useful in characterizing oncogenic rearrangements in cancer metastasis.
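How barcoded reads can point back to individual high molecular weight DNA molecules may be easier to see in code. The following is a simplified sketch, not the analysis used in the study: the function name, the `(barcode, chrom, pos)` read tuples, and the 50 kb gap cutoff are assumptions chosen for illustration. Reads sharing a barcode and lying close together on the same chromosome are grouped into one inferred molecule; a large gap starts a new one.

```python
from collections import defaultdict


def infer_molecules(reads, max_gap=50_000):
    """Group reads into inferred molecules.

    reads: iterable of (barcode, chrom, pos) tuples.
    Returns a list of (barcode, chrom, start, end, n_reads) tuples.
    """
    by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_key[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large for one molecule: close it and start another.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


# Hypothetical toy reads: BC1 tags two separate molecules on chr10.
reads = [
    ("BC1", "chr10", 1_000),
    ("BC1", "chr10", 21_000),
    ("BC1", "chr10", 500_000),
    ("BC2", "chr10", 5_000),
]
molecules = infer_molecules(reads)
```

Once molecules are inferred this way, variants covered by reads from the same molecule can be assigned to the same haplotype, and barcodes shared between distant loci hint at rearrangement breakpoints.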


Subjects
High-Throughput Nucleotide Sequencing/methods , Mutation , Neoplasm Metastasis , Sequence Analysis, DNA/methods , Stomach Neoplasms/genetics , Genome, Human , Genomics/methods , Haplotypes , Humans , Stomach Neoplasms/pathology